Abstract

The purpose of this study was to test the principle of modality by using audio to deliver verbal information 
when that information is designed to support non-verbal information such as animations in a computer-based 
lesson. This was done by comparing the effect of two types of audio support mechanisms (a simple support 
mechanism consisting of declarative statements explaining the animated sequence and a complex support 
mechanism consisting of questions and answers explaining the animated sequence) on undergraduate student 
achievement of conceptual, rule, and procedure knowledge. A control group consisting of the same computer-based 
lesson without any auditory support of the animation was also employed. Learning was measured through drawing, 
terminology, and comprehension tests. The results indicate student achievement was not enhanced by the addition of 
auditory support. 



Introduction 

With computers coming equipped with multimedia features such as the ability to play sound and 
animation, teachers and instructional designers are able to develop and deliver lessons in novel ways using both the 
visual and auditory channels of students. But is student performance improved by the use of both channels as 
opposed to the traditional delivery method relying solely on the visual channel? And at what learning level (the 
factual, conceptual, or rule and procedure level) is there improvement in performance when both channels are used? 

According to cognitive load theory (Kalyuga, Chandler and Sweller 1988, 1989; Sweller 1988; Baddeley 
1986, 1992), methods of instruction that reduce working memory load in order to facilitate the encoding and storing of 
information in long-term memory are effective. One such method draws on dual coding theory (Sadoski and Paivio 2001; 
Clark and Paivio 1991; Paivio 1971, 1986, 1990). Dual coding theory assumes we have two information processing 
systems: a verbal system, comprised of words, whose strength lies in its sequentially ordered hierarchy, in which each 
bit of information paves the way for the next, and a non-verbal system whose strength lies in its synchronous (holistic) 
hierarchy. 

Using audio to deliver verbal information when that information is designed to support non-verbal 
information such as graphics, pictures, and animations can enhance the effect of using both the verbal and visual 
systems. This is known as the modality effect (Clark & Mayer 2003; Penney 1989; Paivio 1986). The modality 
effect (Clark & Mayer, 2003, p. 93) states that “people learn more deeply from multimedia lessons when words 
explaining concurrent animations or graphics are presented as speech rather than as onscreen text.” 

The studies reviewed indicate that the modality effect works well at improving students’ verbal recall of 
factual information (Mayer 1991, Mayer 1992, Barron 1993, Mousavi 1995, Mann 1995, Mayer 1996, Mayer 1998, 
Moreno 1999, Mayer 2001, and Moreno 2002). There is also indication that the modality effect works at improving 
students’ ability to solve problems (Mayer 1991, Mayer 1992, Barron 1993, Mayer 1994, Mousavi 1995, Mayer 
1996, Mayer 1998, Moreno 1999, Chuang 1999, Mayer 2001, Moreno 2002). However, there is limited information 
regarding the effect of modality on student achievement in learning concepts, rules and procedures. The studies 
conducted by Mayer (1998) and Moreno (1999, 2002) found evidence to support the positive effect of modality on 
students’ ability to learn conceptual information, but these learning levels were not isolated and studied on their own. 
In these studies, students’ ability to recall facts, identify concepts, and solve problems were tested together. This 
leaves open the question of how effective the use of modality is if the goal of the lesson is to facilitate achievement 
of conceptual and rule/procedure knowledge. This study seeks to begin filling the gap in the literature by 
isolating these two intellectual skill learning levels. 

Literature Review 

Studies Exploring The Effect Of Dual-Coding: Using The Visual And Verbal Channels 

Mayer, in his study on how computer-based animations can be used to promote scientific understanding 
(1991) and in his study aimed at identifying the role of students’ spatial ability in learning from words and pictures 
(1994), found that undergraduate college students, given a lesson in an area of non-expertise, performed better on 
recall and problem-solving tests when both the verbal and visual systems were utilized. Mayer also found that the 
effect of dual coding was enhanced when the verbal and visual information were presented concurrently, that is, at the 
same time as the animation rather than before or after it (Mayer 1991, 1992, 1994). This finding was replicated by 
Moreno (1999), who tested the ability of undergraduate college students with low prior knowledge of meteorology to 
recall information about the process of lightning. Chuang (1999) found similar results working with seventh-grade 
students in Taipei, Taiwan, when studying the role of gender and field dependence/independence in the ability to 
solve math problems. 

Placing supporting text near the animation it is meant to support is known as the contiguity principle (Clark 
and Mayer 2000), and it addresses the split-attention effect (Chandler and Sweller 1992; Sweller, Chandler, Tierney, 
and Cooper 1990; Tarmizi and Sweller 1988). The split-attention effect happens when students must divide their 
attention between multiple sources of information (Mousavi 1995). The split-attention effect can be reduced by placing 
printed words next to the animation they are supporting (Clark and Mayer 2000). 

The positive effect of dual coding in reducing cognitive load also was evident in studies exploring the 
impact of reducing cognitive load in the lesson summary. In his study to see if reducing cognitive load in lesson 
summaries would help increase students’ retention, Mayer (1996) found that undergraduate students performed 
better on tests of recall and problem solving when summaries included both illustrations and text. 

These results indicate that verbal support of animation, in the form of text, is effective in reducing 
cognitive load. Animation complemented with a textual explanation enabled students to take greater advantage of 
their capability to process information on two levels by stimulating the visual system and by reducing the load 
placed on the verbal processing system. This reshuffling of information in working memory increased their ability 
to make meaning out of the information in preparation for storage in long-term memory. The placement of the 
supporting textual explanation next to the animation further reduced cognitive load and enhanced performance. 

Studies Exploring The Effect Of Modality: Using The Spoken Word In Place Of The Written Word To 
Support The Visual Channel 

While animation helped to reduce cognitive load, it was not reduced as much as it could be because both 
text and animation have to pass through the same (the visual) sensory channel (Mousavi 1995; Chandler and Sweller 
1992). This meant that students were forced to shift their attention between the text and the animation while going 
through the pattern recognition and selective perception processes. Miller (1956, p. 85) referred to the limitation of 
the sensory register as our “channel capacity.” Channel capacity is the maximum amount of information we can hold 
in our sensory memory at any given point in time. 

When animation is supported by a spoken explanation, as opposed to a textual explanation, cognitive load 
is further reduced. This time the reduction comes through the way that information passes from the environment 
through sensory memory and into working memory (Chandler and Sweller 1992; Paivio 1986; Penney 1989). 

Mann (1995), in a study testing students’ ability to construct a solution to an educational problem, used 
temporal sound (spoken information that highlights or details static or moving visuals) in a computer-based lesson to 
make more effective use of students’ channel capacity in sensory memory by using sound with text to support 
concurrent animation. Mann found that students were able to recall a greater amount of critical detail when temporal 
sound was used. 

While Mann used both written text and temporal sound simultaneously, Mayer (1998), in a study that tested 
students’ ability to recall how lightning works, compared the effect of using either temporal sound or written text to 
support concurrent animation and found that students were able to recall more information and perform better on 
problem solving tests if they received the lesson using temporal sound. 

Lai (2000), in her study testing the effect of various types of visual illustrations on concept learning, also found that 
students receiving lessons using temporal sound to support static graphics performed better on matching tests than 
students receiving lessons that used text to perform the same function. 

Similar to spatial contiguity, the placement of text near the animation it supports, the contiguity effect also 
applies when temporal sound is used to support animation. Temporal contiguity occurs when visual and spoken 
materials are presented simultaneously rather than successively (Moreno 1999). Moreno (1999), using a lesson on 
the process of lightning formation, found that learning was negatively impacted if the temporal sound was not 
presented concurrently with the animation it supported, matching Mayer’s (1994) findings with written text. 

Using written text or temporal sound that closely matches the animation it supports without a lot of 
extraneous details is also an important factor for success. In a study on reducing the cognitive load in lesson 
summaries, Mayer (1996), using a lesson on the process of lightning formation, found that students receiving 
concise lesson summaries that included both visual and verbal information performed best on verbal recall and 
problem solving tests. Similarly, Moreno (2002) found that extraneous details hurt student performance when using 
the lesson on the process of lightning formation. 

Issues Of Instructional Consistency 

Instructional consistency (Canelos 1983; Gagne and Medsker 1996) states that intellectual skills are 
hierarchical in nature and that lower order skills are prerequisite for learning higher order skills. Verbal skills 
comprise the base of the hierarchy and are the ability to recall factual information. Verbal skills are a prerequisite for 
learning discriminations. Discriminations, the ability to distinguish between things, come next and are a prerequisite 
for concept formation. Concepts are the ability to classify information based on its critical attributes and are a 
prerequisite for the learning of rules, which are the ability to specify the relationship between concepts. At the top of 
the hierarchy sits higher order rules, the ability to use multiple rules in order to perform a task or solve a problem 
(Gagne and Medsker 1996, pp. 32-33). 

Therefore, if the objective is to generate solutions the instructional unit must contain the rules/procedures, 
concepts, and facts that represent the prerequisite knowledge needed to solve the problem. In the studies reviewed, 
animation supported by narration increased students’ ability to recall factual information and to generate solutions to 
problems. The majority of studies reviewed explored students’ ability to recall factual information and solve 
problems (Mayer 1991, 1992, 1996; Mousavi 1995), to solve problems (Barron 1993; Mayer 1994, 2001), or to 
recall information (Mann 1995). These studies do not give us an indication of where or how animation supported by 
narration works in the instructional hierarchy. Is animation supported by concurrent narration effective at teaching 
concepts or rules/principles? Or is there something about the animation with narration that enables students to build 
connections among intellectual skills? 

This study aims to build on the existing knowledge base and begin to fill in the gaps in the literature by 
testing the hypothesis that animation supported by concurrent temporal sound is better at teaching concepts and 
rules/principles than animation alone. 



Purpose Of This Study 

The purpose of this study was to test the principle of modality by using audio to deliver verbal information 
when that information is designed to support non-verbal information such as animations in a computer-based lesson. 
This was done by comparing the effect of two types of audio support mechanisms (a simple support mechanism 
consisting of declarative statements explaining the animated sequence and a complex support mechanism consisting 
of questions and answers explaining the animated sequence) on undergraduate student achievement of conceptual, 
rule, and procedure knowledge. This study seeks to provide insight into the following questions: 

1. Do students receiving the treatment consisting of the simple audio support (declarative statements 
explaining the animation) perform better on tests of conceptual and rule & procedural knowledge 
than students receiving the treatment with animation alone? 

2. Do students receiving the treatment consisting of complex audio support (questions followed up 
with declarative statements explaining the animation) perform better on tests of conceptual and 
rule & procedural knowledge than students receiving the simple audio treatment and the treatment 
consisting of animation alone? 

Research Design And Methodology 

A posttest-only design was used. The placement and use of animation and temporal sound were 
derived from the results of two pilot studies. In the first study, a computer-based lesson using Dwyer’s (1977) 
lesson on the human heart, titled “The Heart And Its Functions”, was used to determine where to place the 
animation. Since the goal of the primary study was to test students’ ability to perform on tests of conceptual and 
rule/procedure knowledge, the lesson was designed using programmed instruction to ensure that adequate factual 
knowledge was achieved. An item analysis was completed to determine where to place the animation in the lesson 
for the subsequent study. A difficulty level of .60 was used as the cutoff, meaning that any item with a difficulty 
level below sixty percent was targeted for animated support. 
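
The item analysis step can be sketched as follows, assuming the conventional definition of item difficulty as the proportion of students who answer an item correctly. The function names and the small response matrix are hypothetical and are not the study's data; only the .60 cutoff comes from the text.

```python
def item_difficulty(responses):
    """Proportion of students answering each item correctly.

    responses: list of per-student lists, 1 = correct, 0 = incorrect.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(student[i] for student in responses) / n_students
        for i in range(n_items)
    ]


def items_needing_support(difficulties, cutoff=0.60):
    """Indices of items whose difficulty index falls below the cutoff."""
    return [i for i, p in enumerate(difficulties) if p < cutoff]


# Hypothetical responses from five students on three items (not study data).
responses = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
]
difficulties = item_difficulty(responses)
print(difficulties)                         # [0.8, 0.4, 0.8]
print(items_needing_support(difficulties))  # [1]
```

In this illustration only the second item (index 1) falls below the cutoff, so it alone would be targeted for additional support.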

A series of four tests consisting of 20 questions each were used to assess student achievement. An 
identification test was used to assess factual knowledge. A drawing test and a terminology test were used to measure 
conceptual knowledge. A comprehension test was used to measure rule/procedure knowledge. The validity and 
reliability of the tests were reported by Dwyer and Moore (1978). Based on the outcomes of the pilot study, 18 
animations were developed and placed in the lesson adjacent to the textual material they were designed to support. 

A second pilot study was conducted in order to determine which of the animated sequences required the use 
of temporal support. Again, item difficulty was set at .60 with any items supported by animation scoring below sixty 
percent targeted for support using temporal sound. The same four tests were used to assess student achievement. 
Based on the results of this analysis two treatments, one using simple audio support and another using complex 
audio support, were developed to support the animations. 

For the primary study, eighty-eight undergraduate students were recruited from a management class, an 
educational psychology class, and an information systems class. These students were randomly assigned to one of 
three experimental groups: a control group that received the lesson with animation but no audio (NA) support; a 
treatment group assigned a lesson that used simple audio (SA) explanations, in the form of declarative sentences, in 
support of the animation; and a treatment group assigned a lesson that used complex audio (CA) explanations, in 
the form of questions and answers, in support of the animation. Twenty-nine students received the treatment with no 
audio support. Thirty students received the treatment with simple audio support. Twenty-nine students received the 
treatment with the complex audio support. All students received extra credit towards their final grade in the class for 
participating in the study. 

All three experimental groups (NA, SA, CA) received a treatment where the beginning of the lesson was 
comprised of programmed instruction to ensure the prerequisite factual knowledge was gained. However, due to the 
nature of the content it was impossible to deliver the lesson with each learning level isolated. Therefore, the 
programmed instruction section also contained conceptual information along with the factual information. The 
programmed instruction consisted of web pages, each containing one or two pieces of factual knowledge. These 
pages also contained animation, or animation with temporal support, if the item analysis indicated it was 
needed. 

After a series of three to four pages like this, students were asked to answer a series of practice questions 
based on the material just presented. If the student’s score was satisfactory, they were able to move on to the next 
part of the lesson. If the score was unsatisfactory, the student was brought back to the beginning of that series of 
content. There was no limit placed on the amount of time a student could spend on a section or on the number of 
times a section could be repeated. Once the student satisfactorily completed the programmed instruction segment, 
they were given a pencil and 
paper drawing test in which they were asked to draw and label the main sections of the human heart. Once that was 
completed the students were asked to complete an online identification test. In this test a picture of a heart was 
presented with an arrow pointing to the section to be identified. Students were asked to select the name of the 
section from a list of four choices. 

The second part of the lesson was primarily focused on rule/procedure information although there was also 
some conceptual information presented. Students went through a series of web pages describing the flow of blood 
as it passes through the heart. Some pages contained animated support of the content and some pages contained 
animation along with temporal support. The control group (NA) lesson contained only animation. Where the SA 
group received temporal support for animation it was in the form of simple declarative sentences. For example, if 
the animation were designed to show that the ventricles are the thickest walled chambers of the heart, the student, 
when he/she selected the play button, would see the animation and simultaneously hear a statement that said, “The 
ventricles are the thickest walled chambers of the heart.” 

The CA group received the same treatment as the SA group except that the temporal support was delivered 
in a question and answer format. Continuing the example above regarding the ventricles, a student in the CA group 
would hear while the animation was playing, “What are the thickest walled chambers of the heart? The ventricles are 
the thickest walled chambers of the heart.” 

The same voice was used in both treatment groups and the speed at which the animation was played was 
adjusted to fit the length of the temporal support. In all three experimental groups, students were allowed to replay 
the animation as many times as they felt was necessary to understand the material. 

At the end of the lesson students in each group were asked to complete a terminology test where they were 
asked to complete a sentence by selecting the appropriate word or phrase from the choices provided, and a 
comprehension test where they were asked to answer a question by selecting the appropriate answer from a list of 
choices. The identification, terminology, and comprehension tests were built into the computer-based lesson and 
completed on-line. 

Results And Implications 

An analysis of variance (ANOVA) was conducted to compare the three experimental groups on their scores on the 
four criterion tests. Alpha was set at the .05 level. Comparisons were made at two levels: using all items and using 
only the items identified by the item analysis. Comparisons were made using all four tests. 



Results Using All Items 



The table below details the mean and standard deviation for each treatment group counting all items. 

    Test                  Group      Mean    Std. Deviation
    Drawing Test          control    15.97    3.257
                          simple     16.00    3.280
                          complex    17.31    2.714
                          Total      16.42    3.125
    Identification Test   control    17.62    2.470
                          simple     17.40    2.094
                          complex    17.88    2.215
                          Total      17.62    2.247
    Terminology Test      control    11.93    5.028
                          simple     12.70    4.170
                          complex    12.58    5.077
                          Total      12.40    4.714
    Comprehension Test    control    11.00    3.464
                          simple     11.27    3.600
                          complex    12.08    4.06S
                          Total      11.42    3.688
    Test Total            control    56.52   11.903
                          simple     57.37   10.545
                          complex    59.73    9.298
                          Total      57.80   10.637


The F statistic for the drawing test was 1.787; the F statistic for the identification test was .319; the F 
statistic for the terminology test was .218; the F statistic for the comprehension test was .621. The F statistic for the 
total of all tests was .659. These results indicate that there were no significant differences in the performance of 
students in each group. The reliability coefficient was .8612. 
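
As a rough check on how an F statistic of this kind follows from the reported group summaries, the sketch below computes a one-way ANOVA F ratio from group sizes, means, and standard deviations. It is an illustration, not the authors' analysis: the `Group` record and the function name `anova_f_from_summary` are hypothetical, the group sizes (29, 30, 29) are taken from the assignment described in the methodology and are assumed to hold for the drawing test, and the standard deviation for the simple audio group is read as 3.280.

```python
from dataclasses import dataclass


@dataclass
class Group:
    n: int        # number of students in the group
    mean: float   # group mean score
    sd: float     # group sample standard deviation


def anova_f_from_summary(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA from summary stats."""
    total_n = sum(g.n for g in groups)
    grand_mean = sum(g.n * g.mean for g in groups) / total_n
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(g.n * (g.mean - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares, recovered from each group's variance.
    ss_within = sum((g.n - 1) * g.sd ** 2 for g in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Drawing test, all items (tabled values; the group sizes are an assumption).
drawing = [
    Group(n=29, mean=15.97, sd=3.257),  # control (NA)
    Group(n=30, mean=16.00, sd=3.280),  # simple audio (SA)
    Group(n=29, mean=17.31, sd=2.714),  # complex audio (CA)
]
f_ratio, df_b, df_w = anova_f_from_summary(drawing)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

Under these assumptions the ratio comes out near 1.78, close to the reported 1.787; a small gap of this size is what one would expect from rounding in the tabled means and standard deviations.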



Results Using Only Items Identified Through Item Analysis 

The table below details the mean and standard deviation for each treatment group for only the items 
identified as deficient through item analysis. 

    Test                  Group      Mean    Std. Deviation
    Drawing Test          control     3.28    1.601
                          simple      2.97    1.450
                          complex     3.93    1.163
                          Total       3.39    1.458
    Terminology Test      control     4.14    2.489
                          simple      4.93    2.243
                          complex     4.73    2.539
                          Total       4.60    2.416
    Comprehension Test    control     4.38    1.635
                          simple      4.60    2.010
                          complex     4.69    2.074
                          Total       4.55    1.893
    Item Total            control    11.79    4.378
                          simple     12.50    4.392
                          complex    13.27    3.853
                          Total      12.49    4.222


The F statistic for the drawing test was 3.547; the F statistic for the terminology test was .851; the F 
statistic for the comprehension test was .198. The F statistic for the item total on these tests was .835. These results 
indicate that there were no significant differences in the performance of students in each group. The reliability 
coefficient was .7778. 

Implications Of The Results 

The lack of significant differences in performance among the experimental groups indicates that using either 
animation or animation along with temporal support may be problematic when it comes to teaching concepts and 
rules/principles. These results are interesting in light of prior studies whose results indicated animation supported 
with temporal sound was effective at teaching facts and problem-solving skills. With factual knowledge being a 
prerequisite for learning concepts and rules/principles, and rule/principle knowledge being a prerequisite for 
problem-solving learning, it was thought that animation supported by temporal sound would have been effective at 
teaching concepts and rules/principles. However, the results of this study do not support this hypothesis. 

These results do, however, suggest possibilities for further research. Some possible avenues for exploration 
include: What is it, if there is anything, about the nature of concepts and rules/procedures that may make them 
less amenable to animation-based learning? Does temporal sound increase cognitive load when the lesson is aimed at 
conceptual or rule/principle information? Does the kind of content being presented limit the effectiveness of 
modality? Are methods other than simple declarative sentences or questions followed by answers better suited for 
teaching concepts and rules/procedures? 



Summary 

Using temporal sound to support animation in computer-based lessons has been effective when the goal is 
to teach factual knowledge. There is also indication that it is effective for teaching problem-solving skills. This 
study, however, did not find evidence that using animation with temporal sound to teach concepts and 
rules/principles is effective. The results indicate that there is a problem of instructional consistency when applying 
the modality effect to the learning of intellectual skills. 